Abstract

In this paper, we briefly review the current literature on treatment integrity and discuss the relevance of this 
procedure for detecting, measuring and ensuring that the proposed mechanisms of change in cognitive behavior 
therapy, in this case of acceptance and commitment therapy (ACT; S.C. Hayes, Strosahl, & Wilson, 1999), take 
place. We discuss aspects such as how to develop an integrity coding system that takes into account the nuances of 
the ACT model, critical factors when deciding the different processes to include in the coding manual, and 
suggestions for how to operationalize the distinction between adherence and competence from an ACT perspective. 
In addition, we provide more specific guidance on issues such as how to select the segments to be coded, the
training of those who will code the intervention, and the interrater reliability process. Finally, we provide the reader 
with a complete example of a treatment integrity coding manual that was specifically adapted for a randomized 
controlled trial in the treatment of obsessive-compulsive disorder (Twohig et al., 2010a). The aim of the current
paper is to provide some essential tools for ACT researchers to develop treatment integrity protocols so that they are 
more encouraged to adopt such methods in their studies. 

Treatment integrity as it relates to psychotherapy outcome research refers to the degree to which a
particular intervention is implemented in a competent manner with fidelity to the theoretical model and 
the specific processes and procedures specified in the treatment protocol (Nezu & Nezu, 2008). 

Treatment integrity typically involves two processes: adherence, or fidelity to the manual, protocol, or
treatment model, and competence, or the level of skill with which therapists deliver the specified treatment
(Waltz, Addis, Koerner, & Jacobson, 1993). These constructs are related in that in order for a treatment to
be delivered competently, adherence to the treatment is implied, but adherence to the treatment does not 
necessarily imply competence (Waltz et al., 1993). Treatment integrity is primarily viewed as a way to 
conduct a manipulation check to ensure that a treatment has been implemented appropriately. Without 
such assurance, treatment effects cannot be linked to the specific processes purported in the treatment 
model to be related to change. A well designed, reliable treatment integrity manual may also allow 
comparisons of treatments across settings, comparison of therapists across settings and studies, and 
provide important information for training and supervision procedures (Waltz et al., 1993). Integrity 
checks are also vital for randomized clinical trials, as an important tool to discriminate between different 
treatments (Kazdin, 2003; Nezu & Nezu, 2008). There is a small, but growing, body of literature in 
psychotherapy research that highlights the importance of assessing therapist adherence and competence in 
psychotherapy outcome studies and has examined these processes as predictors of outcomes themselves 
(e.g., Barber, Foltz, Crits-Christoph, & Chittams, 2004; Barber et al., 2006; McGlinchey & Dobson,
2003; Perepletchikova, Treat, & Kazdin, 2007).

The relationship of adherence and competence to outcome is complex and conflicting. While 
some studies have shown that these variables predict positive outcomes (Barber, Crits-Christoph, & 
Luborsky, 1996; Barber, Liese, & Abrams, 2003; Carroll et al., 1998, 2000), meta-analyses of treatment 
outcome studies indicate that on the whole, they do not predict outcome (Webb, DeRubeis, & Barber, 
2010). Such findings may highlight the importance of common (or nonspecific) factors, such as
therapeutic alliance, treatment credibility, and client expectations, given that a previous meta-analysis
indicated that common factors might be more influential for outcome across studies than treatment-
specific factors (Wampold et al., 1997). However, Webb and colleagues postulate several reasons for their
null findings, given that some individual studies have demonstrated direct significant effects of adherence 
on outcome (both in relation to and separate from therapeutic alliance). First, they postulate that the
significant heterogeneity of effect sizes for adherence and competence in their analysis makes interpreting 
a nonsignificant mean effect size difficult, and that such disparate effect sizes between studies may be due 
to the fact that the methods used, problems treated, treatments applied, and the outcomes themselves were 
widely varied between studies in the meta-analysis. Webb and colleagues also postulate that, given 
evidence that certain methods within treatments tend to relate better to outcomes than others (e.g., 
problem focused cognitive therapy techniques; DeRubeis & Feeley, 1990; Feeley et al., 1999), the effect 
of adherence to effective components of treatments may be masked by adherence to less helpful 
components. 

Other researchers have attributed such lack of impact of treatment-specific processes on treatment 
outcomes in the cognitive-behavioral literature to the fact that only a minority of studies have actually 
assessed treatment integrity (Bhar & Beck, 2009; Perepletchikova, Hilt, Chereji, & Kazdin, 2009;
Perepletchikova et al., 2007, 2009) and when assessed, there is often an unclear delineation or ineffective 
measurement of purported mechanisms of change (Kazantzis, 2003) and inadequate methodologies are 
often employed (Perepletchikova & Kazdin, 2005). Despite the difficulties and null findings, treatment
integrity is viewed as a vital part of the empirical validation of psychotherapy research and measurement, because
evaluating the implementation of interventions allows treatment development researchers to increase the
efficiency and effectiveness of available therapies by identifying effective and ineffective processes and 
adequate doses of treatment required for certain outcomes (Kazdin, 2007; Nezu & Nezu, 2008; 
Perepletchikova et al., 2007). 

Taken together, these data indicate two areas of further development for treatment outcome 
researchers. First, we postulate that if researchers conducting even smaller outcome studies were to conduct treatment
integrity (TI) coding, and did so using methods that have been identified as useful in other outcome 
studies, such efforts would go a long way towards providing fruitful information in the literature as to the 
importance of adherence to treatment-specific processes. Such procedures could provide a tool for 
discriminating between outcomes (both positive and negative) where there was demonstrated adherence 
to important processes and where there was not, thereby pointing to either a problem with the processes 
themselves or to problems with training or adherence to them. Therefore, we provide a brief overview of 
considerations for planning to conduct TI coding and methods for doing so, as well as citations for further 
reading as needed. Second, researchers can better identify those processes or components of treatments 
that are likely to lead to positive change through basic, analog, component analysis or dismantling studies 
and separate those from the components that are unlikely to lead to change (e.g., see Addis & Jacobson, 
1996 for an excellent example). ACT researchers have conducted several studies examining ACT 
processes as mechanisms of change, and we present some of these data below. There is more work to be 
done, but the work so far is promising in support of many of the ACT processes as not only mechanisms 
of change, but also as mediators of outcome across a wide array of populations. 

Acceptance and Commitment Therapy and Treatment Integrity 

ACT (S.C. Hayes et al., 1999) is a form of cognitive-behavior therapy that has proposed a series 
of alternative processes of change in contrast to other models within the cognitive-behavioral tradition. 
ACT is a contextual behavioral approach that seeks to change the function of undesired experiences 
(including thoughts, feelings and emotions) by changing the contexts in which they occur, rather than 
their form or frequency. This is done by the use of experiential exercises and metaphors (changing the 
verbal context in which these events occur). As such, ACT is a principle-based treatment that seeks to 
influence context and function and that results in the development of psychological flexibility. ACT 
proposes that psychological flexibility, or the ability to persist in valued life activities in the face of 
distressing or unwanted private events, is the general process of change responsible for improvements in 
outcome (S.C. Hayes, Luoma, Bond, Masuda, & Lillis, 2006). The six processes that work together to 
produce psychological flexibility are acceptance of private experiences, defusion from the literal
functions of thoughts, an awareness of the self-as-context or an observing self, contact with the present 
moment, clarity of personal values, and committed action to live consistent with those values. A more 
detailed description of the ACT model can be found elsewhere (e.g., S. C. Hayes et al., 2006).

ACT has placed special emphasis on defining, clarifying, and testing its purported processes of 
change. Such focus on processes of change has been an important part of studies employing ACT for a 
variety of psychological problems. A recent meta-analysis indicated that laboratory, component, and 
outcome studies provide a base of evidence that one or more of the processes of change in ACT are linked 
to outcomes of interest (S.C. Hayes et al., 2006). This is important, because mediational analysis, in 
which processes of change are examined statistically to account for the change seen in outcomes, is 
another way to establish empirical support for particular treatment-specific processes and there has been a 
shortage of studies including such methods within the CBT literature at large (Kazdin, 2001, 2007; Stout, 
2007). A large number of ACT studies to date have included such analyses, which cohere with the 
processes of change emphasis of this model. 

Specifically, changes in acceptance have been shown to mediate outcomes in studies of worksite 
stress (Bond & Bunce, 2000), psychosis (Gaudiano & Herbert, 2006), anxiety and depression (Forman, 
Herbert, Moitra, Yeomans, & Geller, 2007), diabetes management (Gregg, Callaghan, Hayes, & Glenn- 
Lawson, 2007) and obsessive-compulsive disorder (Twohig, Hayes, & Plumb, 2010). Changes in values 
mediated outcomes for epilepsy (Lundgren, Dahl, & Hayes, 2008). Overall psychological flexibility 
mediated outcomes for chronic pain (Wicksell, Olsson, & Hayes, 2010), and acceptance, mindfulness, 
and values each mediated outcomes for generalized anxiety disorder (S.A. Hayes, Orsillo, & Roemer, 
2010). In some studies, ACT processes mediated outcomes and outperformed the processes of change put 
forth in alternative treatment models (e.g., Twohig et al., 2010b; Wicksell et al., 2010). 

These mediation analyses provide evidence that the specific processes employed in ACT and 
other acceptance and mindfulness-based treatments are likely important mechanisms through which 
desired clinical change occurs. Given the potential positive impact of employing empirically validated
processes, we must also verify that these processes are indeed being employed as intended in
psychotherapy outcome studies. Therefore, it behooves clinical researchers to ensure that ACT treatments 
are employed with fidelity to the model and to develop and/or utilize appropriate methods to assess 
treatment integrity. 

However, we recognize that there may be significant barriers to implementing a TI protocol for 
psychotherapy researchers in general and ACT researchers specifically. In a recent survey, researchers 
reported the most common barriers to implementing integrity checks in psychotherapy outcome studies 
were (a) lack of understanding of the processes responsible for change, (b) paucity of labor and funding to 
develop and implement such procedures, and (c) few professional journal editor and granting agency 
requirements for implementing such procedures, even though TI was viewed as an important component 
of outcome research for many of those surveyed (Perepletchikova et al., 2009). ACT, as a relatively new 
treatment, is still in its early stages of empirical evaluation, and as such has relatively fewer studies 
funded or funded at a high level (which in turn could provide the funds and personnel to appropriately 
apply TI procedures) as compared to other, more established treatments (Gaudiano, 2009). Despite this 
disparity, ACT researchers may have somewhat of an advantage in that there are a number of measures 
developed for ACT that assess its purported mechanisms of change, and many of these have been shown to be
predictors and mediators of outcome. While these processes of acceptance, mindfulness and values are 
still being evaluated, some have been shown to mediate outcomes across a wide range of psychological 
difficulties, as is the case of experiential avoidance. 
This is encouraging, and may serve to reduce the burden on ACT researchers for identifying
theoretically-consistent processes of change. Our goal in this paper is to help reduce some of the 
additional burden of developing an ACT integrity coding system by sharing the strategies that have been 
employed in recent ongoing clinical trials and providing additional suggestions for clinical researchers to 
develop their own TI protocols that fit their research questions. 

Developing an ACT Integrity Coding System 

Nezu and Nezu (2008) provide extensive guidelines for developing and implementing an 
integrity protocol, of which the following are particularly relevant for ACT studies: (a) treat integrity as 
an integral part of the study, influencing study design and hypothesis generation, protocol development, 
and therapist training and supervision; (b) develop a treatment manual with integrity issues in mind, 
specifying treatment ingredients, intervention structure, therapist behaviors, and examples of competent 
and incompetent behaviors, differentiating behaviors by context (e.g., behaviors that are more relevant at 
different times in treatment than others), building in flexibility (particularly when client difficulties 
supersede the planned course of a given session), and continuously revising both the treatment and 
integrity protocols until they are in sync; (c) identify and operationalize the elements of treatment,
including conceptually critical processes, that are crucial for integrity assessment; (d) develop a
standardized procedure for integrity raters; and (e) ensure that the method of measuring integrity is
standardized, independently rated, possible within the confines of the study (e.g., videotaping may make 
for different observable behaviors than audiotaping), and maps onto the particular questions of interest 
(e.g., the way in which researchers operationalize and measure adherence and competence may influence 
the ability to assess their relation to outcome, particularly when delivering certain treatment components 
that are conceptualized by the researchers as key for changes in outcomes). 

This last point is particularly relevant for ACT researchers as the treatment model and 
mediational analyses conducted within the literature do indeed indicate that particular processes are likely 
to be responsible for changes in outcomes. Therefore, considering the treatment components that are 
hypothesized to be potential moderator and mediator variables can and should influence both TI protocol 
development as well as the treatment manual development. 

The greatest difficulty with developing a coding system to assess the integrity of ACT is that 
ACT, by its very functional, principle-based nature, does not lend itself easily to rigidly manualized or 
scripted treatment protocols that define therapist behaviors topographically. While some treatments may 
prescribe a set of procedures or techniques that should occur in a particular order, with a particular 
frequency or duration, and during a particular session in the course of treatment, ACT is typically less 
procedural, instead suggesting a range of exercises and metaphors that aim at a particular function based 
on a series of behavioral principles. In addition, while there are some recommendations that particular 
techniques or exercises occur earlier in treatment and others later in treatment, there is nothing within 
ACT itself that requires adherence to these suggestions, particularly if clinical choices are employed 
based on an idiographic functional case conceptualization. Depending on the goals of the researcher, the
funding status of the project, and the experience of the therapists, the ACT treatment protocol may be 
more or less detailed, with varying degrees of flexibility in ACT process focus or procedure from session 
to session. 

Therefore, adherence to ACT treatment protocols must be assessed from a functional perspective. 
Coding therapist behaviors in this way requires clear examples for observable behaviors on the part of the 
therapist, and while function may seem to be an unobservable behavior at times, we will suggest ways in 
which the function of a particular therapist behavior can be identified from statements that the therapist 
makes and the larger context of the therapeutic interaction. This can make for a more difficult coding 
atmosphere, which we will discuss in the section on creating a coding team. 
We cannot stress enough the need to state clearly in a written coding manual both the items to be
assessed for integrity as well as a rubric for coding them. Doing so may require a team of individuals with 
some experience conducting ACT treatment, such as postdoctoral level therapists, additional members of 
the treatment team, or collaboration with other colleagues. While developing a coding manual in this 
manner may be difficult and time-consuming for smaller and underfunded studies, there are existing 
coding manuals (including the one in the appendix of this paper) that can be used as a starting point for 
developing an individualized coding manual that includes items of importance for a particular study. 
Having a coding rubric allows coders to have a clear understanding of the meaning of a particular rating 
(be it a dichotomous rating of “present” versus “not present” or a more detailed Likert scale) and to make 
comparisons between coders. We will present recommendations for coding procedures including the 
number/percentage of sessions to code and training coders to use the manual in a later section of the 
paper. 


The strategies and suggestions supplied in this paper are based on our work on three funded 
clinical trials, although the coding procedures employed in these studies included both graduate and 
undergraduate research assistants who were successfully trained to reliability. First, many of the 
suggestions in this paper come from R.V.’s work helping to develop and train coders to distinguish 
between ACT model processes and two other intervention models in a large randomized controlled trial 
for reducing stigma and burnout in substance abuse counselors (e.g., S. C. Hayes, Luoma, Kohlenberg, 
Vilardaga, Lillis, et al., 2009). In this study the authors found good interrater reliability scores (.85) in 
coding ACT workshops. Notice that in this case, the adherence coding procedure was conducted in the 
context of the delivery of ACT in a group format and not at an individual level. Second, the sample 
coding manual provided in the appendix was utilized in J.P.’s work on a recent funded clinical trial of 
ACT for OCD (Twohig et al., 2010a). Positive outcomes for the ACT intervention were reported, and 
integrity coding indicated that each of the identified ACT processes occurred at the highest level during at
least one session of treatment and that there was no observed occurrence of processes that were expressly 
prohibited in the ACT treatment manual. Further, this study reported high interrater reliability scores (.94) 
between coders at the same institution, and acceptable interrater reliability scores (.80) between coders 
across sites. Prior to its modification for the OCD study, a version of the sample TI manual was 
successfully used in both small, unfunded studies as well as large, funded studies in our research 
laboratory (e.g., Gifford et al., 2004; Twohig, Hayes, & Masuda, 2006).
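
As a minimal illustration of how agreement between two coders might be quantified, the sketch below computes Cohen's kappa for categorical codes. The text above does not state which reliability statistic produced the reported scores, so the choice of kappa is an assumption made here for illustration only; the function name and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same segments categorically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of segments.")
    n = len(rater_a)
    # Proportion of segments on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal code frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical present (1) / not present (0) codes for ten segments.
coder_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
coder_2 = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))
```

For ordinal or continuous ratings (e.g., Likert-type scales), a weighted kappa or an intraclass correlation may be a better fit than the unweighted statistic sketched here.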

Coding Adherence to ACT Processes 

Choosing the coding unit is an important first step to developing a coding manual. The entire 
session may be rated or observations may be chunked into smaller segments (e.g., within a day-long
workshop) depending upon the researcher’s study and questions of interest. Whichever unit of 
measurement is chosen, the researcher should also consider the best method of rating the observations:
whether to allow multiple processes to be coded in a given observation or to force a choice of the primary process
that occurred in a given instance.

Given that the processes in the ACT model are often interrelated and presented together,
we recommend designing a coding system that allows for rating all processes along a continuum (e.g., for 
frequency, simple occurrence, or depth) in a given observation rather than forcing a primary process to be 
coded. There are instances in which a single-item code (e.g., wherein coders are asked to select a primary 
process for that observation, to the exclusion of other processes) may be more in line with a particular 
research question, such as in component analysis studies or investigations that aim to test order effects for 
components in ACT. 
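
The two rating approaches described above can be sketched as a simple data structure. This is an illustrative sketch only: the 0 to 4 scale, the process labels, and the function names are assumptions introduced here, not the scheme used in the coding manual in the appendix.

```python
ACT_PROCESSES = (
    "acceptance",
    "defusion",
    "self_as_context",
    "present_moment",
    "values",
    "committed_action",
)
MAX_RATING = 4  # hypothetical scale: 0 = did not occur, 4 = occurred extensively


def rate_segment(ratings):
    """Continuum approach: every process gets a rating; unlisted ones default to 0."""
    unknown = set(ratings) - set(ACT_PROCESSES)
    if unknown:
        raise ValueError(f"Unknown process codes: {sorted(unknown)}")
    for process, value in ratings.items():
        if not 0 <= value <= MAX_RATING:
            raise ValueError(f"Rating for {process} must be between 0 and {MAX_RATING}.")
    return {process: ratings.get(process, 0) for process in ACT_PROCESSES}


def primary_process(segment):
    """Forced-choice approach: the single highest-rated process, or None if unclear."""
    top = max(segment.values())
    if top == 0:
        return None  # no ACT process observed in this segment
    leaders = [process for process, value in segment.items() if value == top]
    return leaders[0] if len(leaders) == 1 else None  # ties are left unresolved


segment = rate_segment({"acceptance": 3, "defusion": 2, "values": 1})
print(segment)
print(primary_process(segment))
```

Storing the full set of ratings preserves the option of deriving a forced-choice code later, whereas a forced-choice code alone cannot be expanded back into per-process ratings.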

Coding for adherence to the therapeutic model requires a focus on coding the observable behavior 
of the therapist. How might one code observable behavior when the goal of coding is to assess the 
function of the response? Take for example a therapist making the following statement, “Just think
something else!” to a client in session. Taken alone, this statement might appear to be indicative of a 
positive thinking strategy, and inconsistent with ACT. Rather than make inferences about what the 
therapist’s intention was in making such a statement (e.g., perhaps the therapist said the statement in what 
one could construe as a sarcastic tone of voice), the coder should rely only on the observable behavior of 
the therapist — by assessing the preceding or subsequent statements and other observable behaviors in the 
room. Suppose this therapist followed that statement with “So, how successful were you?” Now the 
function of the statement “Just think something else” is a bit clearer — it was an attempt to examine the 
workability of attempting to control one’s thoughts. In another scenario, perhaps a client said in response 
to “Just think something else”, “I’ve tried that, and it doesn’t work very well” and the therapist replied,
“Exactly. So what else could you do?” In this second example, the statements the therapist makes would 
both be indicative of workability/control as the problem. Note that in both examples, the coders do not 
need to consider what the therapist’s “intention” might have been — the observable behavior of the 
therapist tells a clear story for the purpose of the statements. 

Additionally, coding focuses on the observable behavior of the therapist, and not the client’s 
behavior or responses. For example, a therapist would be coded as adherent if they present the material 
accurately to clients, even if the client expresses confusion or displays disruptive behavior in the room. 
How well the therapist presents the material and how the therapist responds to the client's confusion or other
concerns are best considered under the competency rating.

In addition to coding the therapist’s introduction of therapeutic material, the therapist’s facilitation of
the client's behaviors can be considered for coding as well. For example, if a client were to say, “I am
definitely noticing my mind telling me that I'm useless a lot. I'm thanking my mind for that thought”, the
therapist might continue facilitating defusion by saying “Great, let's practice thanking your mind for some 
of your other sticky thoughts. When else do you notice your mind chiming in with critical statements?” In 
this example, while the client initiated defusion activity, the therapist facilitated it in the room following 
the client's defused statement. Notice that what is coded in this instance is the therapist’s behavior and not 
what occasioned it. 

In considering the options for developing a coding rubric, Waltz and colleagues (1993) indicate 
that the simplest coding for a variable of interest is a dichotomous rating of occurrence versus 
nonoccurrence, but recommend coding the frequency with which the variable occurred during an 
observation as a potentially more useful method. Given the global nature of the ACT processes, it is more 
appropriate in our view to examine the frequency with which ACT processes are employed in a given 
observation rather than simple occurrence, as it will provide a more in-depth account of the dose of ACT 
being applied. For example, a coder may observe that an ACT therapist employed an acceptance exercise 
for 5 minutes during a session, and mark that “acceptance” occurred, but perhaps the therapist spent the 
remaining 45 minutes simply listening to a client speak without responding in a particularly ACT- 
consistent way (e.g., simply responding with statements such as “Yeah, tell me more.” or “Sounds 
difficult.”) as opposed to responding more actively from an ACT stance (e.g., “Tell me what you have 
done when you noticed that feeling showing up. Has it worked?” or “So this is one experience that you’ve 
had difficulty making space for. What’s the cost of trying not to have it?”). Were this same session to be 
coded using a frequency rating system, acceptance might be coded as occurring with low frequency, 
lessening the ACT-consistency of the session. 
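
The contrast between occurrence and frequency coding in the example above can be sketched as follows. The sketch assumes, purely for illustration, that a 50-minute session is divided into ten 5-minute intervals and that coders record the set of processes observed in each interval; the interval length and function names are hypothetical.

```python
def occurrence(interval_codes, process):
    """Dichotomous rating: did the process occur at least once in the session?"""
    return any(process in codes for codes in interval_codes)


def frequency(interval_codes, process):
    """Proportion of intervals in which the process was observed."""
    if not interval_codes:
        raise ValueError("At least one coded interval is required.")
    hits = sum(process in codes for codes in interval_codes)
    return hits / len(interval_codes)


# One interval with an acceptance exercise, then nine intervals of
# non-specific listening in which no ACT process was coded.
session = [{"acceptance"}] + [set() for _ in range(9)]

print(occurrence(session, "acceptance"))  # True
print(frequency(session, "acceptance"))   # 0.1
```

Under the dichotomous rating this session is indistinguishable from one in which acceptance work filled every interval, whereas the frequency rating makes the low dose visible.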

In addition to frequency coding, we also advocate that an adherence coding system examine the 
depth or extensiveness of each occurrence as well. In our experience, a therapist may bring up values 
several times throughout a session (e.g., saying things like “It sounds like you really care about your 
family”, “One cost of avoiding anxiety is not doing things you like to do”, or “What if we could move 
toward what you care about no matter what shows up inside your skin?”) but not go into great detail
about those values. Alternatively, they may discuss values extensively two or three times during a 
session (e.g., conducting overt values clarification or using the recommended metaphors/exercises for that
session) and both could be considered highly adherent ratings. 

Identifying ways to recognize subtle moves in the room is an important part of capturing the 
richness of the interaction between the therapist and client (or audience). For example, defusion may 
occur as many brief instances in the session rather than as any one overarching metaphor or time- 
consuming exercise, and as such it can require a more nuanced awareness of responses indicative of 
defusion. It can be difficult to design an integrity coding manual that accounts for both gross instances of 
a therapist's behavior as well as these more nuanced or subtle instances; subtle discriminations are more 
difficult to train and are subject to interpretation. Even when all coders have acknowledged that some 
processes occurred in a particular segment, there may be disagreement about which particular process or 
processes took place. It is in the nature of subtleties that they are difficult to perceive or agree upon, and 
this is something that the researcher needs to take into account when designing a particular coding manual 
as well as planning for a coder training procedure. 

Deciding which facets of the protocol to include in a coding manual. An obvious starting 
point is to include the six processes in the ACT psychological flexibility model: acceptance, defusion,
self-as-context, present moment awareness, values, and committed action. However, coding manuals are 
better at capturing behaviors of interest if they include specific strategies or techniques of importance to a 
particular study. The iterative process between a treatment protocol and the TI protocol is especially clear 
here. The research questions of the study may shed important light on the treatment processes and 
procedures that will be important to assess carefully in a TI protocol. Whichever processes and/or 
techniques are of the highest importance should be included in the integrity coding manual. Particularly 
when researchers plan to disseminate the protocol, a more specific coding manual will help therapists in 
different settings stay closer to the intention of the protocol. 

For example, the coding manual in the appendix includes items that assess control as the 
problem, creative hopelessness strategies, and the idea of workability. Although they are not the main ACT
processes, these strategies are key in establishing acceptance, defusion, and so on as the alternatives to
unhelpful control or avoidance agendas and may therefore be important to measure. Alternatively, if a 
treatment protocol was heavily values focused, the coding manual could include several additional facets 
of values-related therapy behavior such as clarifying values, conducting values-related experiential 
exercises, assigning and troubleshooting values-related homework exercises, and utilizing clinical tools 
and values measurements in and out of session. 

Additionally, if a study includes features of the treatment protocol that are integral for the 
treatment delivery but that go beyond the ACT processes themselves, it is important to operationalize and 
include them as well. For example, in another coding manual designed to assess adherence to ACT in an
RCT for helping addiction counselors overcome barriers to engaging with difficult clients and reducing
burnout (S. C. Hayes, Luoma, Kohlenberg, Vilardaga, Lillis, et al., 2009), the authors created a coding 
manual that included key processes or aspects of each of the four treatment conditions tested in the trial. 
The result was a coding manual that included aspects such as psychoeducation about burnout, recognizing 
stigmatizing attitudes towards difficult clients, and awareness of our own cultural bias. Such codes were
key to accurately representing all the different components of the treatment study and verifying that each
treatment arm was a reliably different treatment from the others.

Discriminating between processes across treatment models. A vitally important function of a 
TI protocol is to enable researchers to discriminate between treatment models (Perepletchikova & Kazdin, 
2005). Discriminating between treatment models is clearly important for any study in which two or more 
treatment models are directly compared, but discriminating between treatment models can be important to 
establish the treatment of interest (e.g., ACT) as distinct from other models, even if no direct comparison 
to that model is part of a particular study. This is especially important for any modality within the family
of CBT treatments, including ACT, because if ACT constitutes a refinement of techniques and proposed 
processes of change as compared to other forms of CBT, this should be able to be shown reliably through 
an adherence coding procedure. This aspect is a key factor for treatment development in CBT. 

Nezu and Nezu (2008) also propose that a clear TI protocol is even more important when the 
comparative treatments are more similar in process and procedure than they are different. For example, a 
study comparing traditional CBT to ACT will require a more nuanced TI coding manual to establish key 
differences (e.g., between cognitive defusion versus cognitive restructuring), whereas a study comparing 
relaxation training to a relationship-focused treatment may only require a fairly simple set of 
discriminations. Particularly when comparing two similar treatments such as CBT and ACT, we 
recommend cross-coding for key processes of change so as to ensure that cognitive therapy strategies are 
not present in ACT treatment sessions and vice versa. 

While discriminating between the comparative treatments utilized in a randomized trial is 
important, it is also important to assess that therapists avoided processes that were explicitly excluded in 
the treatment, even when there is no comparison group utilizing that model in a particular study. Such a 
strategy is important for establishing evidence that ACT, particularly when applying it to a new treatment 
population, is distinct from existing treatments. For example, in order to distinguish ACT as a possible 
efficacious treatment that utilized processes different than those already found effective in exposure with 
response prevention (ERP), a recent study purposely excluded in-session formal exposure in the treatment 
manual (Twohig et al., 2010a) and as such an ERP item was included in the integrity coding to ensure that 
this variable did not occur in the treatment sessions. 

While a therapist might adhere to the use of ACT processes throughout the therapeutic process, 
he or she might also frequently use therapeutic strategies that encourage patients to challenge the content, 
frequency or intensity of their thoughts and feelings, or other therapy moves that are antithetical to ACT. 
Therefore, it is often just as important to establish that therapists avoided the use of these prohibited 
strategies or rationales. This provides a means of further explicating ACT processes as distinct from 
processes in other models as well as highlighting the unique features of the treatment. In coding for 
adherence to the ACT model, such antithetical items can include using experientially avoidant change 
strategies (e.g., “stay away from situations that make you scared, and you'll feel a lot better”), reassuring 
statements directed at reducing or removing the client’s experiences in the moment (e.g., “I know you’re 
feeling sad now, but everything is going to be just fine,” or “you don’t have to feel bad about that 
anymore, it wasn’t your fault.”), or the idea that thoughts and feelings cause behavior (e.g., “When you 
think you are a bad person, you feel sad, and then you stay home. So we're going to help you deal with the 
thought that you're a bad person so that you can go outside more.”). These examples of behaviors are 
ACT-inconsistent in that they are direct attempts to change the form, frequency or intensity of thoughts 
and feelings so that clients will feel better and/or will live better lives. ACT-consistent behaviors would 
be to present acceptance, defusion, present moment, or self-as-context strategies in order to help the client 
choose more values-consistent behaviors even when uncomfortable experiences are present. 

Overall adherence to the protocol. In addition to ACT-specific processes or procedures, there 
are likely other specific procedures or topics that the researchers wish to ensure happen in each session or 
in specified sessions including symptom assessments, specific and/or regular homework assignments, 
homework debriefing, inside and outside of session experiences/exposures, and so on. These items can 
either be included under adherence to the treatment protocol, or if they are recommended to occur in 
specific sessions, it should be specified when they should occur. 

Overall adherence to the protocol may be more of a global assessment when the treatment 
protocol is written as a flexible protocol as is often the case in ACT studies. In that case, the TI coding 
manual should ideally specify the parameters within which the therapist's behavior is adherent to the 
protocol. When developing parameters for coding therapists delivering a flexible protocol, the parameters 
should be as specific as possible while still encompassing the desired flexibility intended in the protocol. 
Perhaps the protocol specifies the processes that should take primary precedence within the early therapy 
sessions, the middle sessions, and the later sessions; perhaps there are certain metaphors or exercises that 
should occur within the treatment at some point; and perhaps therapists were expected to conduct in-session 
exercises and assign homework based on the process that the therapist chose to focus on within 
each session. By including language in the coding manual that addresses these parameters, adherence to 
the treatment protocol can be better assessed. 

In previous studies, we have found it useful to set a particular standard for an appropriate amount 
of ACT processes being applied throughout the study. For example, OCD researchers identified that for 
the treatment to be applied with fidelity to the model, each ACT process assessed in the coding manual 
should be rated at the highest level at some point across sessions, indicating that that process occurred 
with great frequency and depth during at least one session in treatment (Twohig et al., 2006, 2010a). Also 
in these studies, there was high adherence to the treatment protocol, as certain processes tended to occur 
with high frequency at the same approximate time into treatment (e.g., early, middle, and later sessions) 
across sessions and therapists and this pattern fit the treatment protocol. Other standards could be 
developed that fit each study’s goals or design (e.g., looking over each case for levels of different 
processes, etc.). While previous ACT studies point to a probable effective treatment dose for different 
problems as defined by number of sessions (e.g., S. C. Hayes et al., 2006), this is only a rough estimate of 
hours of global intervention and provides little guidance for what is considered “enough” ACT to effect 
change. That is a question that can only be answered through further study. 
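
The fidelity standard described above (each coded process reaching the highest rating in at least one session) can be checked mechanically once ratings are entered. The following is a minimal sketch, not the procedure used in the cited studies: it assumes a five-point adherence scale, and the process names, session numbers, and function names are hypothetical placeholders.

```python
# Hypothetical data layout: ratings[session_number][process_name] = adherence rating.
# The 1-5 scale and the process labels are assumptions made for illustration.
MAX_RATING = 5


def processes_meeting_standard(ratings, max_rating=MAX_RATING):
    """Map each process to True if it reached max_rating in at least one session."""
    processes = sorted({p for session in ratings.values() for p in session})
    return {
        p: any(session.get(p, 0) >= max_rating for session in ratings.values())
        for p in processes
    }


def meets_fidelity_standard(ratings, max_rating=MAX_RATING):
    """True only if every coded process hit the top rating at some point."""
    return all(processes_meeting_standard(ratings, max_rating).values())


example = {
    1: {"acceptance": 5, "defusion": 2, "values": 1},
    4: {"acceptance": 3, "defusion": 5, "values": 2},
    8: {"acceptance": 2, "defusion": 3, "values": 4},
}

# List the processes that never reached the top rating in the coded sessions.
missing = [p for p, ok in processes_meeting_standard(example).items() if not ok]
print(missing)
```

In this made-up case, the values process never reaches the top rating, so the case would not meet the standard as defined here. Other standards could be substituted by changing the threshold or the aggregation rule.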

Operationalizing Competence 

There is general agreement in the literature that both adherence and competence are important for 
TI. Adherence is thought to be a prerequisite for competence, but is not sufficient for it (e.g., Waltz et al., 
1993). For instance, even if therapists deliver treatment in a way that is adherent to the procedures in the 
protocol, they may do so in an incompetent manner that will threaten the internal validity of the study and 
limit the interpretation about observed outcomes (Perepletchikova et al., 2007). Therefore, we suggest 
that researchers include at least one competence item in the TI coding manual whenever doing so is 
feasible. Competence can be considered globally, such that it is assessed at the end of a treatment session 
or observation instance. In the sample TI manual provided in the appendix, there is an ‘overall 
competence’ item that was coded for each therapy session observed. 

Three simple features provide an excellent foundation for assessing competence. First, how 
consistently did the therapist address the client’s needs? Second, how consistently did the therapist attend 
to the client’s responses to treatment targets? Third, how clearly and in-depth did the therapist apply the 
procedures outlined in the treatment manual? The sample TI manual in the appendix includes a rubric for 
using these three features of competence to identify competence on a five-point Likert-type scale. 

Opportunities for a more fine-grained analysis of competence are also possible, although to our 
knowledge no ACT studies have yet examined them specifically. First, learning how to avoid common 
therapy pitfalls is often described as a key competency for therapists training in ACT (e.g., S.C. Hayes et 
al., 1999; Luoma, Hayes, & Walser, 2007) and assessing for such moves could provide an additional level 
of detail regarding competent ACT delivery. Specific behaviors that are considered ACT-inconsistent are 
convincing or lecturing the client that the procedures in the ACT model are correct or useful, pushing the 
client to experience certain feelings in the room without obtaining permission, or prescribing certain 
personal values or value-directed behaviors. At another level, researchers may be interested in 
specifically assessing the competence with which therapists in a study deliver particular processes, 
although such coding may require a higher level of skill and familiarity with the ACT model to do well. 
Perhaps the goal of a study is to assess the competence with which a therapist or therapists tend to 
demonstrate a particular process (e.g., defusion) or conduct a particular procedure (e.g., creative 
hopelessness). Identifying such process- or procedure-specific strengths and weaknesses could then 
provide much needed data for planning future training, dissemination, or assessment of therapists across 
sites. 


Competence and the ACT therapeutic posture. While the operationalization of competence 
above is quite comprehensive for most studies and global competence is likely the easiest to code to 
reliability, there are other features of ACT that relate to competence such as the ACT therapeutic posture. 
The therapeutic relationship or alliance has been shown to be an important component of therapy, and is one 
that has been appealed to in the common factors literature (Wampold et al., 1997). This literature has 
emphasized the importance of the therapeutic alliance as a measure of the working relationship and 
collaboration between therapist and client in the pursuit of mutually agreed upon therapeutic goals 
(Messer & Wampold, 2002; Wampold, 2005), but in ACT, the therapeutic relationship is taken one step 
further to include a therapeutic posture specific to the ACT model (Vilardaga & Hayes, in press). ACT 
trainers and manuals typically suggest that there is a therapeutic posture or stance from which therapists 
should work, such that the ACT processes are modeled and instigated by therapists themselves throughout 
treatment. It may be important for the goals of a study to assess the degree to which the therapist 
embodies an ACT therapeutic posture, meaning that they respond to the client from an accepting, defused, 
and value-directed stance. As a result, the therapeutic relationship in ACT is built upon the therapists’ 
facilitating, modeling, and instigating of the specific skills (e.g., present moment awareness, value- 
consistent living) that are explicitly taught, through experiential exercises and metaphors in the treatment 
(Pierson & Hayes, 2007; Vilardaga & Hayes, in press). While the explicit behavioral skills training 
component of the treatment can be more clearly captured in adherence, as of yet this other important 
feature of the ACT model has not been assessed or coded in any study. Future studies that attend to these 
processes both in relationship to treatment outcome and within TI coding would be a welcome addition to 
the literature. 

Coding Procedure 

There are a number of considerations for developing a coding procedure, such as selecting sessions 
to code, the number of coders to select, whether the coders should be blinded to treatment condition, and 
the best way to train, assess, and supervise the coders. In many cases, two coders are sufficient, but 
additional coders may be preferred to accommodate the amount of coding. 

Nezu and Nezu (2008) propose that researchers should weigh the time and cost associated with 
coding all sessions conducted with the importance of establishing that the treatment was applied exactly 
as planned across all sessions. Given the amount of skill and time it can take to train coders to carefully 
code each therapy session, coding all sessions for integrity may not be feasible for unfunded studies or studies 
with limited financial or personnel support. However, because so very few RCTs actually employ TI 
procedures at all (Perepletchikova et al., 2009), even in well-funded CBT trials (Bhar & Beck, 2009), we 
propose that coding even a percentage of treatment sessions for integrity will increase the methodological 
rigor of any treatment outcome study. 

Selection of sessions to be coded. In the case that the researchers cannot afford or choose not to 
code all sessions conducted for integrity, the selection of sessions to be coded is an important process and 
one that is not always done entirely at random in studies that assess integrity. Why? First, therapists are 
rarely equally talented at applying the protocol throughout the study; they very often get better as they 
have seen more clients (particularly when the study utilizes novice or trainee therapists) and as they grow 
accustomed to the protocol. Second, it can be important for gleaning a representative sample of therapist 
behavior in applying the entire protocol to assess the therapists equally throughout the early, middle, and 
later sessions across clients. Webb and colleagues (2010) report that for some studies in their 
meta-analysis, TI coding consisted of coding one session of each therapist’s treatment as “representative” of 
that therapist’s behavior. While any integrity coding is laudable, coding only one session per therapist is 
unlikely to accurately capture all relevant variables of interest (at least for some treatment modalities) and 
Webb and colleagues (2010) postulate that such a procedure could have contributed to their null findings. 
For ACT studies, where the intervention is likely to look very different in an early versus a later session 
(e.g., different processes are likely to be targeted), it is even more important to consider coding more than 
one session as representative. Even if selecting a few sessions at random, the sessions selected for coding 
could be skewed to include a greater number of certain sessions than others. This could result in a greatly 
unrepresentative sample of therapist behavior, as oversampling certain sessions decreases the likelihood 
of assessing the therapists' full repertoire throughout the course of treatment. Therefore, it is important to 
block the random selection of sessions to be balanced across early, middle, and later sessions, as well as 
across the therapist's earlier clients and their later clients (Nezu & Nezu, 2008). 
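
The blocked random selection described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than a prescribed procedure: it assumes that sessions are split into early, middle, and later phases by thirds of the protocol length, that each client has already been labeled as one of the therapist's "earlier" or "later" clients, and that one session is drawn per block. The field names and function names are hypothetical.

```python
import random
from collections import defaultdict


def phase_of(session_number, total_sessions):
    """Label a session early, middle, or late by thirds of the protocol (an assumption)."""
    third = total_sessions / 3
    if session_number <= third:
        return "early"
    if session_number <= 2 * third:
        return "middle"
    return "late"


def select_sessions(sessions, total_sessions, per_block=1, seed=None):
    """Randomly draw up to per_block sessions from every
    (therapist, client_order, phase) block, so no block is over- or under-sampled."""
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    blocks = defaultdict(list)
    for s in sessions:
        phase = phase_of(s["session_number"], total_sessions)
        blocks[(s["therapist"], s["client_order"], phase)].append(s)
    selected = []
    for key in sorted(blocks):
        pool = blocks[key]
        selected.extend(rng.sample(pool, min(per_block, len(pool))))
    return selected


# Hypothetical recordings: two therapists, an earlier and a later client each,
# eight sessions per client.
recordings = [
    {"therapist": t, "client_order": c, "session_number": n}
    for t in ("T1", "T2")
    for c in ("earlier", "later")
    for n in range(1, 9)
]
to_code = select_sessions(recordings, total_sessions=8, per_block=1, seed=2010)
print(len(to_code))
```

Recording the seed alongside the list of selected sessions allows the selection to be reproduced and reported.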

Establishing interrater reliability. Interrater reliability (IRR) establishes the degree to which the 
coders’ ratings agree, and anything above 80% agreement is generally considered sufficient, with 
reliability of at least 90% being ideal. To assess IRR, the coders each assess the same therapy sessions 
and their ratings are compared to each other. Calculating an IRR statistic is one way to establish that the 
coders have all been properly trained. Early in the coding training process the coders may have a low 
IRR but as their codes are discussed and coding discrepancies are minimized, the IRR should increase. 
While establishing good IRR before beginning the official coding process is important, so too is 
periodically assessing each coder over time to minimize coder drift. Therefore, we recommend creating 
an IRR coding schedule that includes a planned overlap between raters at the beginning of their official 
coding, in the middle, and towards the end of the sessions they will code. In selecting the sessions to be 
coded for IRR, it can be useful to apply a similar strategy as recommended above. Consider selecting 
earlier, middle, and later sessions across the therapists' time in the study as sessions to be coded for IRR 
so that the coders are able to learn from a fairly representative sample of therapist behaviors throughout 
the course of therapy. We recommend seeking additional reading for statistical and methodological 
suggestions appropriate for the study in which IRR will be conducted (e.g., Tinsley & Weiss, 2000). 
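
As a rough illustration of the percent-agreement figure discussed above, the sketch below compares two coders' ratings of the same items. It also computes Cohen's kappa, a chance-corrected index that is not discussed in the text above and is included here only as one commonly used alternative; for ordinal rating scales other statistics may be more appropriate. The ratings and function names are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two coders gave the identical rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: sum over categories of the product of each coder's marginal rate.
    categories = set(rater_a) | set(rater_b)
    p_expected = sum(counts_a[k] * counts_b[k] for k in categories) / (n * n)
    if p_expected == 1:
        return 1.0  # both coders used a single identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings of ten coded items on a five-point scale.
coder_1 = [1, 2, 3, 3, 5, 4, 2, 1, 5, 3]
coder_2 = [1, 2, 3, 4, 5, 4, 2, 2, 5, 3]

print(round(percent_agreement(coder_1, coder_2), 2))  # 0.8
print(round(cohens_kappa(coder_1, coder_2), 2))       # 0.75
```

Here the coders agree on 8 of 10 items (80%), while kappa is lower (0.75) because some of that agreement would be expected by chance alone.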

Training coders. There are several features of a coding training procedure that may facilitate an 
effective and efficient coding procedure. First, if possible, it is important to blind the coders to the 
treatment model being applied in any given session. This may not be feasible in the case where 
videotaping occurs or the treatment providers are clearly linked to one treatment or another. However, 
whenever possible, it may allow for a more objective assessment of both prescribed and proscribed 
behaviors across sessions. Second, the assembling of a team to code sessions is important. In the 
aforementioned studies from which we draw our experience, an ideal coding team includes members who 
have some exposure to functional/behavioral thinking and know or are willing to learn the basic 
principles of ACT, but being a therapist is not necessarily required (e.g., we have successfully trained to 
reliability a range of coders including undergraduate research assistants). Novices of the ACT model may 
require outside reading to establish a foundation of knowledge about the model and behavioral thinking 
and in such cases we recommend incorporating into coder training readings from such texts as The ABC’s 
of Human Behavior (Törneke & Ramnerö, 2008), Learning ACT (Luoma et al., 2007), ACT Made Simple 
(Harris, 2009), or A CBT-Practitioner’s Guide to ACT (Ciarrochi & Bailey, 2008), depending on the 
potential coder’s background. Third, successfully training coders to apply accurately a complex TI 
protocol involves regular meetings and multiple shaping opportunities. We recommend selecting training 
segments that a lead trainer and the coding trainees can watch/listen to together and discuss the 
discriminations as they arise. Fourth, we recommend assessing the coders periodically (particularly if the 
coding process takes place over an extended period of time) to ensure that their ratings have not drifted 
too far from the original intentions in the integrity coding system. Assessing for coder drift can also be an 
opportunity to further hone the integrity coding manual to help clarify any points of confusion or 
ambiguity. Finally, in our experience, psychological flexibility on the part of the integrity coding team is 
as important as psychological flexibility on the part of the therapist or client. When coders experience a 
coding item as ambiguous, confusing, or worded such that it conflicts with the intended meaning, it is 
easy to disagree with other team members. In that sense, disagreeing on specific therapy observations can 
be a frustrating process and thus as a team leader it is important to foster psychological flexibility in terms 
of the coder’s own reactions and their reactions to others. In addition, the integrity manual should be 
adapted to better explicate the process of interest, even when the authors of the manual believe the 
wording to be clear. 


Summary 

Including an integrity protocol can provide additional empirical evidence for the model being 
used, which can be of the highest importance when a study is designed to treat a novel population or 
apply a new technique. While mediation analysis can provide empirical support for the purported 
processes of change, mediation analysis does not indicate whether or not the procedures/exercises used in 
the treatment were responsible for the changes in the mediator and outcome. As such, researchers must 
also conduct at minimum a manipulation check to ensure that the procedures were employed as directed 
by the treatment protocol. Together, treatment integrity checks and mediational analysis can provide 
strong empirical evidence that the processes of change are indeed responsible for desirable changes in 
outcome. 

The strategies presented in this paper are designed to be a starting point for researchers 
developing a study-specific TI ACT protocol. However, we recognize that many of our suggestions 
represent a best-case scenario for coding for TI. Admittedly, our experience in coding for ACT processes 
comes exclusively from federally-funded studies with multiple personnel. We hope that these suggestions 
are helpful not only for those with the financial resources to conduct carefully developed integrity checks, 
but for those conducting smaller, unfunded studies to incorporate some level of integrity coding. While the 
field as a whole is lagging behind the need for such integrity checks, we hope that papers such as this will 
help provide the necessary tools for ACT researchers to consider developing and employing integrity 
checks in their studies more regularly. 